Abstract — We consider a multi-agent optimization problem 
where agents aim to cooperatively minimize a sum of local 
objective functions subject to a global inequality constraint and 
a global state constraint set. In contrast to existing papers, we 
do not require that the objective, constraint functions, and state 
constraint sets are convex. We propose a distributed approximate 
dual subgradient algorithm to enable agents to asymptotically 
converge to a pair of approximate primal-dual solutions over 
dynamically changing network topologies. Convergence is
guaranteed provided that Slater's condition and the strong
duality property are satisfied.

I. Introduction 

Recent advances in computation, communication, sensing
and actuation have stimulated intensive research in
networked multi-agent systems. In the systems and control
community, this has been translated into how to solve global
control problems, expressed by global objective functions,
by means of local agent actions. More specifically, problems
considered include multi-agent consensus or agreement [4],
coverage control, formation control [25], and sensor fusion [28].

In the optimization community, a problem of focus is to
minimize a sum of local objective functions by a group of
agents, where each function depends on a common global
decision vector and is only known to a specific agent. This
problem is motivated by others in distributed estimation [20], [27],
distributed source localization, and network utility
maximization [13]. More recently, consensus techniques have
been proposed to address the issues of switching topologies
in networks and non-separability in objective functions.
More specifically, the paper [18] presents the first analysis of
an algorithm that combines average consensus schemes with
subgradient methods. Using projection in the algorithm of [18],
the authors in [19] further solve a more general setup that takes
local state constraint sets into account. Further, in [29] we
develop two distributed primal-dual subgradient algorithms,
which are based on saddle-point theorems, to analyze a
more general situation that incorporates global inequality and
equality constraints. The aforementioned algorithms are
extensions of classic (primal or primal-dual) subgradient methods,
which generalize gradient-based methods to minimize
non-smooth functions. These methods require the optimization
problems under consideration to be convex in order to determine
a global optimum.
The focus of the current paper is to relax the convexity
assumption in [29]. The challenges induced by the presence
of non-convexity will be circumvented by the integration of
Lagrangian dualization and subgradient schemes. These two
techniques have been popular and efficient approaches to
solve large-scale, structured convex optimization problems.
However, subgradient methods do not automatically
generate primal solutions for nonsmooth convex
optimization problems, and numerous approaches have been
designed to construct primal solutions; e.g., by removing the
nonsmoothness [26], by employing ascent approaches,
and by generating ergodic sequences.

Statement of Contributions. Here, we investigate a multi- 
agent optimization problem where agents are trying to min- 
imize a sum of local objective functions subject to a global 
inequality constraint and a global state constraint set. The ob- 
jective and constraint functions as well as the state-constraint 
set could be non-convex. A distributed approximate dual sub- 
gradient algorithm is introduced to find a pair of approximate 
primal-dual solutions. Specifically, the update rule for dual 
estimates combines an approximate dual subgradient scheme 
with average consensus algorithms. To obtain primal solutions 
from dual estimates, we propose a novel recovery scheme:
primal estimates are not updated if the variations induced by
dual estimates are smaller than some predetermined threshold;
otherwise, primal estimates are set to minimizers of the local
Lagrangians at the current dual estimates. This algorithm is shown
to asymptotically converge to a pair of approximate primal-dual
solutions over a class of switching network topologies under
Slater's condition and the strong duality property.

II. Problem formulation and preliminaries 

Consider a networked multi-agent system where agents are
labeled by $i \in V := \{1, \dots, N\}$. The multi-agent system
operates in a synchronous way at time instants $k \in \mathbb{N} \cup \{0\}$,
and its topology will be represented by a directed weighted
graph $\mathcal{G}(k) = (V, E(k), A(k))$, for $k \ge 0$. Here,
$A(k) := [a^i_j(k)] \in \mathbb{R}^{N \times N}$ is the adjacency matrix, where the scalar
$a^i_j(k) \ge 0$ is the weight assigned to the edge $(j, i)$, and
$E(k) \subseteq V \times V \setminus \mathrm{diag}(V)$ is the set of edges with non-zero
weights. The set of in-neighbors of agent $i$ at time $k$ is denoted
by $\mathcal{N}_i(k) = \{j \in V \mid (j, i) \in E(k) \text{ and } j \neq i\}$. Similarly,
we define the set of out-neighbors of agent $i$ at time $k$ as
$\mathcal{N}_i^{\mathrm{out}}(k) = \{j \in V \mid (i, j) \in E(k) \text{ and } j \neq i\}$. We here make
the following assumptions on network communication graphs: 

Assumption 2.1 (Non-degeneracy): There exists a constant
$\alpha > 0$ such that $a^i_i(k) \ge \alpha$, and $a^i_j(k)$, for $i \neq j$,
satisfies $a^i_j(k) \in \{0\} \cup [\alpha, 1]$, for all $k \ge 0$.

Assumption 2.2 (Balanced Communication)$^1$: It holds
that $\sum_{j \in V} a^i_j(k) = 1$ for all $i \in V$ and $k \ge 0$, and
$\sum_{i \in V} a^i_j(k) = 1$ for all $j \in V$ and $k \ge 0$.

Assumption 2.3 (Periodical Strong Connectivity):
There is a positive integer $B$ such that, for all $k_0 \ge 0$, the
directed graph $(V, \bigcup_{k=0}^{B-1} E(k_0 + k))$ is strongly connected.

The above network model is standard in the analysis of
average consensus algorithms; e.g., see [21], [22], and
distributed optimization in [19], [29]. Recently, an algorithm was
given in [8] which allows agents to construct a balanced graph
out of a non-balanced one under certain assumptions.
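
As an illustration of Assumptions 2.1 to 2.3, the following Python sketch checks them numerically for given weight matrices and edge sets. It is only a sketch under stated assumptions: the function names `check_weights` and `strongly_connected`, the tolerance, and the ring example are hypothetical and do not come from the paper.

```python
import numpy as np

def check_weights(A, alpha, tol=1e-9):
    """Check Assumptions 2.1 and 2.2 for one weight matrix.

    A[i, j] stands for the weight a^i_j(k) that agent i assigns to agent j.
    Returns (non_degenerate, balanced).
    """
    A = np.asarray(A, dtype=float)
    off = A[~np.eye(len(A), dtype=bool)]          # off-diagonal weights
    non_degenerate = bool(
        np.all(np.diag(A) >= alpha - tol)
        and np.all((off <= tol) | ((off >= alpha - tol) & (off <= 1 + tol)))
    )
    balanced = bool(
        np.allclose(A.sum(axis=1), 1.0, atol=tol)   # rows sum to one
        and np.allclose(A.sum(axis=0), 1.0, atol=tol)  # columns sum to one
    )
    return non_degenerate, balanced

def strongly_connected(edge_sets, n):
    """Check that the union of the edge sets is strongly connected.

    Each edge (j, i) is read as a link from agent j to agent i.
    """
    reach = np.eye(n, dtype=bool)
    for edges in edge_sets:
        for j, i in edges:
            reach[j, i] = True
    for m in range(n):                              # Warshall transitive closure
        reach |= reach[:, [m]] & reach[[m], :]
    return bool(reach.all())

# Hypothetical example: three agents, each averaging with one ring neighbor.
ring = 0.5 * np.eye(3) + 0.5 * np.roll(np.eye(3), 1, axis=1)
flags = check_weights(ring, alpha=0.5)
joint = strongly_connected([{(0, 1), (1, 2)}, {(2, 0)}], 3)   # B = 2 graphs
single = strongly_connected([{(0, 1), (1, 2)}], 3)
```

In this example neither of the two individual graphs needs to be strongly connected; only their union over a window of length $B$ does, which is what Assumption 2.3 asks for.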

The objective of the agents is to cooperatively solve the
following primal problem (P):

$$\min_{z \in \mathbb{R}^n} \sum_{i \in V} f_i(z), \quad \text{s.t.} \quad g(z) \le 0, \quad z \in X, \tag{1}$$

where $z \in \mathbb{R}^n$ is the global decision vector. The function
$f_i : \mathbb{R}^n \to \mathbb{R}$ is only known to agent $i$, continuous, and referred
to as the objective function of agent $i$. The set $X \subseteq \mathbb{R}^n$, the
state constraint set, is compact. The function $g : \mathbb{R}^n \to \mathbb{R}^m$
is continuous, and the inequality $g(z) \le 0$ is understood
component-wise; i.e., $g_\ell(z) \le 0$, for all $\ell \in \{1, \dots, m\}$,
and represents a global inequality constraint. We will denote
$f(z) := \sum_{i \in V} f_i(z)$ and $Y := \{z \in \mathbb{R}^n \mid g(z) \le 0\}$.
We will assume that the set of feasible points is non-empty;
i.e., $X \cap Y \neq \emptyset$. Since $X$ is compact and $Y$ is closed,
we can deduce that $X \cap Y$ is compact. The continuity of $f$
follows from that of $f_i$. In this way, the optimal value $p^*$ of
the problem (P) is finite and $X^*$, the set of primal optimal
points, is non-empty. Throughout this paper, we suppose the
following Slater's condition holds:

Assumption 2.4 (Slater's Condition): There exists a vector
$\bar z \in X$ such that $g(\bar z) < 0$. Such $\bar z$ is referred to as a Slater
vector of the problem (P).

Remark 2.1: All the agents can agree upon a common
Slater vector $\bar z$ through a maximum-consensus scheme. This
can be easily implemented as part of an initialization step,
and thus the assumption that the Slater vector is known to
all agents does not limit the applicability of our algorithm.
Specifically, the maximum-consensus algorithm is described
as follows:

Initially, each agent $i$ chooses a Slater vector $z_i(0) \in X$
such that $g(z_i(0)) < 0$. At every time $k \ge 0$, each agent $i$
updates its estimates by using the following rule:

$$z_i(k+1) = \max_{j \in \mathcal{N}_i(k) \cup \{i\}} z_j(k), \tag{2}$$

where we use the following relation for vectors: for $a, b \in \mathbb{R}^n$,
$a < b$ if and only if there is some $\ell \in \{1, \dots, n-1\}$ such
that $a_\kappa = b_\kappa$ for all $\kappa < \ell$ and $a_\ell < b_\ell$.

The periodical strong connectivity assumption 2.3 ensures
that after at most $(N-1)B$ steps, all the agents reach
consensus; i.e., $z_i(k) = \max_{j \in V} z_j(0)$ for all $k \ge (N-1)B$.
In the remainder of this paper, we assume that the Slater vector
$\bar z$ is known to all the agents. •
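
The maximum-consensus rule (2) can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function names, the fixed ring topology and the initial vectors are hypothetical, and Python's built-in tuple comparison (which is lexicographic over all components) is assumed as the vector order.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]

def max_consensus_step(z: Sequence[Vector],
                       in_neighbors: Sequence[Sequence[int]]) -> List[Vector]:
    """One synchronous round of rule (2).

    Each agent keeps the lexicographic maximum of its own vector and the
    vectors of its current in-neighbors.
    """
    return [max([z[i]] + [z[j] for j in in_neighbors[i]])
            for i in range(len(z))]

def run_max_consensus(z0: Sequence[Vector],
                      in_neighbors: Sequence[Sequence[int]],
                      rounds: int) -> List[Vector]:
    """Apply rule (2) for a fixed number of rounds on a fixed graph."""
    z = list(z0)
    for _ in range(rounds):
        z = max_consensus_step(z, in_neighbors)
    return z

# Hypothetical example: 4 agents on a fixed directed ring, so B = 1.
z0 = [(0.1, 0.5), (0.3, -0.2), (0.3, 0.4), (-1.0, 2.0)]
ring_in_neighbors = [[3], [0], [1], [2]]      # agent i listens to agent i - 1
final = run_max_consensus(z0, ring_in_neighbors, rounds=3)   # (N - 1) * B
```

On this ring the largest vector needs exactly $N - 1 = 3$ rounds to reach the agent farthest downstream, which matches the $(N-1)B$ bound stated in the remark.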

$^1$It is also referred to as double stochasticity.



In [29], in order to solve the convex case of the problem
(P) (i.e., $f_i$ and $g$ are convex functions and $X$ is a convex set),
we propose two distributed primal-dual subgradient algorithms
where primal (resp. dual) estimates move along subgradients
(resp. supgradients) and are projected onto convex sets. The
absence of convexity impedes the use of the algorithms in [29]
since, on the one hand, (primal) gradient-based algorithms are
easily trapped in local minima; on the other hand, projection
maps may not be well-defined when (primal) state constraint
sets are non-convex. In this paper, we will employ Lagrangian
dualization to circumvent the challenges caused by
non-convexity.

Towards this end, we construct a directed cyclic graph
$\mathcal{G}_{cyc} := (V, E_{cyc})$ where $|E_{cyc}| = N$. We assume that each
agent has a unique in-neighbor (and out-neighbor). The
out-neighbor (resp. in-neighbor) of agent $i$ is denoted by $i_D$
(resp. $i_U$). With the graph $\mathcal{G}_{cyc}$, we will study the following
approximate problem of problem (P):

$$\min_{x = (x_i)} \sum_{i \in V} f_i(x_i), \quad \text{s.t.} \quad g(x_i) \le 0, \quad -x_i + x_{i_D} - \Delta \le 0, \quad x_i - x_{i_D} - \Delta \le 0, \quad x_i \in X, \quad \forall i \in V, \tag{3}$$

where $\Delta := \delta \mathbf{1}$, with $\delta$ a small positive scalar, and $\mathbf{1}$ is the
column vector of $n$ ones. The problem (3) reduces to the
problem (P) when $\delta = 0$, and will be referred to as problem
$(P_\Delta)$. Its optimal value and the set of optimal solutions will be
denoted by $p^*_\Delta$ and $X^*_\Delta$, respectively. Similarly to the problem
(P), $p^*_\Delta$ is finite and $X^*_\Delta \neq \emptyset$.

Remark 2.2: The cyclic graph $\mathcal{G}_{cyc}$ can be replaced by any
strongly connected graph. Each agent $i$ is endowed with two
inequality constraints: $x_i - x_j - \Delta \le 0$ and $-x_i + x_j - \Delta \le 0$,
for each out-neighbor $j$. For notational simplicity, we will
use the cyclic graph $\mathcal{G}_{cyc}$, which has a minimum number of
constraints, as the initial graph. •

A. Dual problems 

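Before introducing dual problems, let us denote by
$\Xi_i := \mathbb{R}^m_{\ge 0} \times \mathbb{R}^{nN}_{\ge 0} \times \mathbb{R}^{nN}_{\ge 0}$,
$\Xi := \mathbb{R}^{mN}_{\ge 0} \times \mathbb{R}^{nN}_{\ge 0} \times \mathbb{R}^{nN}_{\ge 0}$,
$\xi_i := (\mu_i, \lambda, w) \in \Xi_i$, $\xi := (\mu, \lambda, w) \in \Xi$ and $x := (x_i) \in X^N$.
The dual problem $(D_\Delta)$ associated with $(P_\Delta)$ is given by

$$\max_{\mu, \lambda, w} Q(\mu, \lambda, w), \quad \text{s.t.} \quad \mu, \lambda, w \ge 0, \tag{4}$$

where $\mu := (\mu_i) \in \mathbb{R}^{mN}$, $\lambda := (\lambda_i) \in \mathbb{R}^{nN}$ and $w := (w_i) \in \mathbb{R}^{nN}$.
Here, the dual function $Q : \Xi \to \mathbb{R}$ is given as

$$Q(\xi) \equiv Q(\mu, \lambda, w) := \inf_{x \in X^N} \mathcal{L}(x, \mu, \lambda, w),$$

where $\mathcal{L} : \mathbb{R}^{nN} \times \Xi \to \mathbb{R}$ is the Lagrangian function

$$\mathcal{L}(x, \xi) \equiv \mathcal{L}(x, \mu, \lambda, w) := \sum_{i \in V} \big( f_i(x_i) + \langle \mu_i, g(x_i) \rangle + \langle \lambda_i, -x_i + x_{i_D} - \Delta \rangle + \langle w_i, x_i - x_{i_D} - \Delta \rangle \big).$$

We denote the dual optimal value of the problem $(D_\Delta)$ by $d^*_\Delta$
and the set of dual optimal solutions by $D^*_\Delta$. In what follows
we will assume that the duality gap is zero.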
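Assumption 2.5 (Strong duality): For the introduced
problems $(P_\Delta)$ and $(D_\Delta)$, it holds that $p^*_\Delta = d^*_\Delta$.

We endow each agent $i$ with the local Lagrangian function
$\mathcal{L}_i : \mathbb{R}^n \times \Xi_i \to \mathbb{R}$ and the local dual function $Q_i : \Xi_i \to \mathbb{R}$
defined by

$$\mathcal{L}_i(x_i, \xi_i) := f_i(x_i) + \langle \mu_i, g(x_i) \rangle + \langle -\lambda_i + \lambda_{i_U}, x_i \rangle + \langle w_i - w_{i_U}, x_i \rangle - \langle \lambda_i, \Delta \rangle - \langle w_i, \Delta \rangle,$$
$$Q_i(\xi_i) := \inf_{x_i \in X} \mathcal{L}_i(x_i, \xi_i).$$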

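In the problem $(P_\Delta)$, the introduction of approximate
consensus constraints $-\Delta \le x_i - x_{i_D} \le \Delta$, $i \in V$, renders the $f_i$
and $g$ separable. As a result, the global dual function $Q$ can
be decomposed into a simple sum of the local dual functions
$Q_i$. More precisely, the following holds:

$$\begin{aligned}
Q(\xi) &= \inf_{x \in X^N} \sum_{i \in V} \big( f_i(x_i) + \langle \mu_i, g(x_i) \rangle + \langle \lambda_i, -x_i + x_{i_D} - \Delta \rangle + \langle w_i, x_i - x_{i_D} - \Delta \rangle \big) \\
&= \inf_{x \in X^N} \sum_{i \in V} \big( f_i(x_i) + \langle \mu_i, g(x_i) \rangle + \langle -\lambda_i + \lambda_{i_U}, x_i \rangle + \langle w_i - w_{i_U}, x_i \rangle - \langle \lambda_i, \Delta \rangle - \langle w_i, \Delta \rangle \big) \\
&= \sum_{i \in V} \inf_{x_i \in X} \big( f_i(x_i) + \langle \mu_i, g(x_i) \rangle + \langle -\lambda_i + \lambda_{i_U}, x_i \rangle + \langle w_i - w_{i_U}, x_i \rangle - \langle \lambda_i, \Delta \rangle - \langle w_i, \Delta \rangle \big) \\
&= \sum_{i \in V} Q_i(\xi_i).
\end{aligned} \tag{5}$$

It is worth mentioning that $\sum_{i \in V} Q_i(\xi_i)$ is not separable since
$Q_i$ depends upon the neighbor's multipliers $\lambda_{i_U}$ and $w_{i_U}$.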
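
The decomposition (5) can be checked numerically on a toy instance. The sketch below assumes scalar decisions ($n = m = 1$), a finite grid standing in for the compact set $X$, three agents on a directed cycle, and arbitrary non-negative multipliers; the objective and constraint functions are hypothetical examples chosen only because they are non-convex.

```python
import itertools
import numpy as np

N = 3                                     # agents on a directed cycle
X = np.linspace(-1.0, 1.0, 9)             # finite stand-in for the set X
delta = 0.1                               # Delta = delta * 1 with n = 1
down = [(i + 1) % N for i in range(N)]    # i_D: out-neighbor of agent i
up = [(i - 1) % N for i in range(N)]      # i_U: in-neighbor of agent i

# Hypothetical non-convex local objectives and a scalar constraint.
f = [lambda x: np.sin(3 * x), lambda x: x**3 - x, lambda x: -np.cos(2 * x)]
g = lambda x: x**2 - 0.5

rng = np.random.default_rng(0)
mu, lam, w = rng.uniform(0.0, 1.0, (3, N))   # arbitrary multipliers >= 0

def lagrangian(x):
    """Global Lagrangian L(x, mu, lam, w) for one joint decision x."""
    return sum(f[i](x[i]) + mu[i] * g(x[i])
               + lam[i] * (-x[i] + x[down[i]] - delta)
               + w[i] * (x[i] - x[down[i]] - delta) for i in range(N))

def local_dual(i):
    """Local dual function Q_i, minimising L_i over the grid."""
    vals = (f[i](X) + mu[i] * g(X) + (-lam[i] + lam[up[i]]) * X
            + (w[i] - w[up[i]]) * X - lam[i] * delta - w[i] * delta)
    return vals.min()

# Brute-force infimum over X^N versus the sum of local infima.
Q_global = min(lagrangian(x) for x in itertools.product(X, repeat=N))
Q_sum = sum(local_dual(i) for i in range(N))
```

The two numbers agree up to rounding because the coupling terms $\langle \lambda_i, x_{i_D} \rangle$ are re-indexed onto the agent that owns $x_{i_D}$, which is the step between the first and second lines of (5).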

B. Dual solution sets 

Slater's condition ensures the boundedness of dual
solution sets for convex optimization; e.g., [9], [11]. We will
shortly see that Slater's condition plays the same role
in non-convex optimization. To achieve this, we define the
function $\hat Q_i$ as follows:

$$\hat Q_i(\mu_i, \lambda_i, w_i) = \inf_{x \in X^N} \big( f_i(x_i) + \langle \mu_i, g(x_i) \rangle + \langle \lambda_i, -x_i + x_{i_D} - \Delta \rangle + \langle w_i, x_i - x_{i_D} - \Delta \rangle \big).$$

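Let $\bar z$ be a Slater vector for problem (P). Then $\bar x = (\bar x_i) \in X^N$
with $\bar x_i = \bar z$ is a Slater vector of the problem $(P_\Delta)$.
Similarly to (3) and (4) in [29], which make use of Lemma 3.2
in the same paper, we have that for any $\mu_i, \lambda_i, w_i \ge 0$, it holds
that

$$\max_{\xi \in D^*_\Delta} \|\xi\| \le N \max_{i \in V} \frac{f_i(\bar z) - \hat Q_i(\mu_i, \lambda_i, w_i)}{\beta(\bar z)}, \tag{6}$$

where $\beta(\bar z) := \min\{\min_{\ell \in \{1, \dots, m\}} -g_\ell(\bar z), \delta\}$. Let $\mu_i$, $\lambda_i$ and
$w_i$ be zero in (6), and it leads to the following upper bound
on $D^*_\Delta$:

$$\max_{\xi \in D^*_\Delta} \|\xi\| \le N \max_{i \in V} \frac{f_i(\bar z) - \hat Q_i(0, 0, 0)}{\beta(\bar z)}, \tag{7}$$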

where $\hat Q_i(0, 0, 0) = \inf_{x_i \in X} f_i(x_i)$ and it can be computed
locally. Since $f_i$ and $g$ are continuous and $X$ is compact, it is
known that $Q_i$ is continuous; e.g., see Theorem 1.4.16 in [1].
Similarly, $Q$ is continuous. Since $D^*_\Delta$ is also bounded, then
we have that $D^*_\Delta \neq \emptyset$.

Remark 2.3: The requirement of exact agreement on $z$
in the problem (P) is slightly relaxed in the problem $(P_\Delta)$ by
introducing a small positive scalar $\delta$. In this way, on the one
hand, the global dual function $Q$ is a sum of the local dual
functions $Q_i$, as in (5); on the other hand, $D^*_\Delta$ is non-empty
and uniformly bounded. These two properties play important
roles in the design of the algorithm that follows. •

C. Other notation 

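Denote the approximate dual optimal solution set by
$D^\epsilon_\Delta := \{\xi \in \Xi \mid Q(\xi) \ge d^*_\Delta - N\epsilon\}$. Similar to (7), we have the
following upper bound on $D^\epsilon_\Delta$:

$$\max_{\xi \in D^\epsilon_\Delta} \|\xi\| \le N \max_{i \in V} \frac{f_i(\bar z) - \hat Q_i(0, 0, 0) + \epsilon}{\beta(\bar z)}. \tag{8}$$

In the algorithm we will present in the following section,
agents will compute $\gamma_i(\bar z) := \frac{f_i(\bar z) - \hat Q_i(0, 0, 0) + \epsilon}{\beta(\bar z)}$.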
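
The quantity $\gamma_i(\bar z)$ only needs local data, which the following Python sketch makes concrete. It assumes scalar decisions, replaces the infimum over $X$ by a minimum over grid samples, and uses hypothetical functions and parameter values; the helper name `gamma_i` is not from the paper.

```python
import numpy as np

def gamma_i(f_i, g, z_bar, X_samples, delta, eps):
    """Local bound gamma_i(z_bar) from the text, evaluated on samples of X.

    beta(z_bar) = min(min_l -g_l(z_bar), delta); the infimum of f_i over X
    is approximated by the minimum over X_samples (an assumption).
    """
    beta = min(np.min(-np.atleast_1d(g(z_bar))), delta)
    q_hat_zero = min(f_i(x) for x in X_samples)
    return (f_i(z_bar) - q_hat_zero + eps) / beta

# Hypothetical data: two agents, X = [-1, 1], Slater vector z_bar = 0.
X_samples = np.linspace(-1.0, 1.0, 201)
g = lambda z: z - 0.5                      # g(0) = -0.5 < 0
fs = [lambda x: x**2, lambda x: -x]
gammas = [gamma_i(f, g, 0.0, X_samples, delta=0.1, eps=0.05) for f in fs]
gamma = len(fs) * max(gammas)              # N * max_i gamma_i(z_bar)
```

In a network, the final maximum would be obtained by a max-consensus round over the scalars $\gamma_i(\bar z)$ rather than by a central `max` call.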

Define the set-valued map $\Omega_i : \Xi_i \to 2^X$ in the following
way: $\Omega_i(\xi_i) := \operatorname{argmin}_{x_i \in X} \mathcal{L}_i(x_i, \xi_i)$; i.e., given $\xi_i$, the set
$\Omega_i(\xi_i)$ is the collection of solutions to the following local
optimization problem:

$$\min_{x_i \in X} \mathcal{L}_i(x_i, \xi_i). \tag{9}$$

Here, $\Omega_i$ is referred to as the marginal map of agent $i$. Since
$X$ is compact and $f_i$, $g$ are continuous, then $\Omega_i(\xi_i) \neq \emptyset$ in (9)
for any $\xi_i \in \Xi_i$. In the algorithm we will develop in the next
section, each agent is required to solve the local optimization
problem (9) at each iterate. We assume that this problem (9)
can be easily solved. This is the case for problems with $n = 1$,
or $f_i$ and $g$ being smooth (the extremum candidates are the
critical points of the objective function and isolated corners
of the boundaries of the constraint regions) or having some
specific structure which allows the use of global optimization
methods such as branch and bound algorithms. For some $\epsilon > 0$,
we define the set-valued map $\Omega^\epsilon_i : \Xi_i \to 2^X$ as follows:

$$\Omega^\epsilon_i(\xi_i) := \{x_i \in X \mid \mathcal{L}_i(x_i, \xi_i) \le Q_i(\xi_i) + \epsilon\},$$

which is referred to as the approximate marginal map of agent
$i \in V$.
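
For $n = 1$ the local problem (9) can be handled by exhaustive search, as the text suggests. The Python sketch below evaluates $\mathcal{L}_i$ on a grid and returns the grid versions of $Q_i$, $\Omega_i$ and $\Omega^\epsilon_i$. The grid, the double-well objective and the function names are assumptions made for illustration only.

```python
import numpy as np

def local_lagrangian(x, f_i, g, mu_i, lam_i, lam_up, w_i, w_up, delta):
    """L_i for scalar x (n = m = 1).

    lam_up and w_up are the multipliers of the in-neighbor i_U.
    """
    return (f_i(x) + mu_i * g(x) + (-lam_i + lam_up) * x
            + (w_i - w_up) * x - lam_i * delta - w_i * delta)

def marginal_maps(X_samples, eps, **kw):
    """Return (Q_i, Omega_i, Omega_i^eps) restricted to the grid X_samples."""
    vals = np.array([local_lagrangian(x, **kw) for x in X_samples])
    q_i = vals.min()
    omega = X_samples[np.isclose(vals, q_i)]      # grid minimisers
    omega_eps = X_samples[vals <= q_i + eps]      # grid eps-minimisers
    return q_i, omega, omega_eps

# Hypothetical example: a double-well objective with zero multipliers.
X_samples = np.linspace(-2.0, 2.0, 401)
q_i, omega, omega_eps = marginal_maps(
    X_samples, eps=0.01, f_i=lambda x: (x**2 - 1) ** 2, g=lambda x: x - 3.0,
    mu_i=0.0, lam_i=0.0, lam_up=0.0, w_i=0.0, w_up=0.0, delta=0.1)
```

The example has two global minimisers, so $\Omega_i$ is set-valued here, and $\Omega^\epsilon_i$ is a union of two small intervals around them.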

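In the space $\mathbb{R}^n$, we define the distance between a point
$z \in \mathbb{R}^n$ and a set $A \subset \mathbb{R}^n$ as $\operatorname{dist}(z, A) := \inf_{y \in A} \|z - y\|$,
and the Hausdorff distance between two sets $A, B \subset \mathbb{R}^n$ as
$\operatorname{dist}(A, B) := \max\{\sup_{z \in A} \operatorname{dist}(z, B), \sup_{y \in B} \operatorname{dist}(A, y)\}$.
We denote by $B_{\mathcal{U}}(A, r) := \{u \in \mathcal{U} \mid \operatorname{dist}(u, A) \le r\}$ and
$B_{2^{\mathcal{U}}}(A, r) := \{U \in 2^{\mathcal{U}} \mid \operatorname{dist}(U, A) \le r\}$ where $\mathcal{U} \subset \mathbb{R}^n$.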
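
For finite point sets the two distances above reduce to minima and maxima, as in this short Python sketch. The function names and the sample sets are hypothetical.

```python
import numpy as np

def dist_point_set(z, A):
    """dist(z, A): smallest Euclidean distance from the point z to the set A."""
    return min(np.linalg.norm(np.subtract(z, y)) for y in A)

def hausdorff(A, B):
    """Hausdorff distance between two finite sets of points."""
    return max(max(dist_point_set(z, B) for z in A),
               max(dist_point_set(y, A) for y in B))

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (3.0, 0.0)]
d_AB = hausdorff(A, B)
```

Here the point $(3, 0)$ of $B$ is at distance 2 from $A$, while every point of $A$ is within distance 1 of $B$, so the larger of the two one-sided values determines the result.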

III. Distributed approximate dual subgradient algorithm

In this section, we devise a distributed approximate dual
subgradient algorithm which aims to find a pair of approximate
primal-dual solutions to the problem $(P_\Delta)$. Its convergence
properties are also summarized.

For each agent $i$, let $x_i(k) \in \mathbb{R}^n$ be the estimate of the
primal solution $x_i$ to the problem $(P_\Delta)$ at time $k \ge 0$,
$\mu_i(k) \in \mathbb{R}^m_{\ge 0}$ be the estimate of the multiplier on the
inequality constraint $g(x_i) \le 0$, and $\lambda^i(k) \in \mathbb{R}^{nN}_{\ge 0}$ (resp.
$w^i(k) \in \mathbb{R}^{nN}_{\ge 0}$)$^2$ be the estimate of the multiplier
associated with the collection of the local inequality constraints
$-x_j + x_{j_D} - \Delta \le 0$ (resp. $x_j - x_{j_D} - \Delta \le 0$), for all
$j \in V$. We let $\xi_i(k) := (\mu_i(k)^T, \lambda^i(k)^T, w^i(k)^T)^T$, for $i \in V$,
and $v_i(k) := (\mu_i(k)^T, v^i_\lambda(k)^T, v^i_w(k)^T)^T$ where
$v^i_\lambda(k) := \sum_{j \in V} a^i_j(k) \lambda^j(k)$ and $v^i_w(k) := \sum_{j \in V} a^i_j(k) w^j(k)$.

The Distributed Approximate Dual Subgradient (DADS, for
short) Algorithm is described as follows:

Initially, each agent $i$ chooses a common Slater vector $\bar z$,
computes $\gamma_i(\bar z)$ and obtains $\gamma := N \max_{i \in V} \gamma_i(\bar z)$ through
a max-consensus algorithm. After that, each agent $i$ chooses
initial states $x_i(0) \in X$ and $\xi_i(0) \in \Xi_i$.

Agent $i$ updates $x_i(k)$ and $\xi_i(k)$ as follows:

Step 1. For each $k \ge 1$, given $v_i(k)$, solve the local
optimization problem (9), obtain a solution in $\Omega_i(v_i(k))$ and
the dual optimal value $Q_i(v_i(k))$. Produce the primal estimate
$x_i(k)$ in the following way: if $x_i(k-1) \in \Omega^\epsilon_i(v_i(k))$, then
$x_i(k) = x_i(k-1)$; otherwise, choose $x_i(k) \in \Omega_i(v_i(k))$.

Step 2. For each $k \ge 0$, generate the dual estimate $\xi_i(k+1)$
according to the following rule:

$$\xi_i(k+1) = P_{M_i}[v_i(k) + \alpha(k) \mathcal{D}_i(k)], \tag{10}$$

where the scalar $\alpha(k)$ is a step-size. The supgradient vector of
agent $i$ is defined as $\mathcal{D}_i(k) := (\mathcal{D}^i_\mu(k)^T, \mathcal{D}^i_\lambda(k)^T, \mathcal{D}^i_w(k)^T)^T$,
where $\mathcal{D}^i_\mu(k) := g(x_i(k)) \in \mathbb{R}^m$, $\mathcal{D}^i_\lambda(k)$ has components
$\mathcal{D}^i_\lambda(k)_i := -\Delta - x_i(k) \in \mathbb{R}^n$, $\mathcal{D}^i_\lambda(k)_{i_U} := x_i(k) \in \mathbb{R}^n$, and
$\mathcal{D}^i_\lambda(k)_j = 0 \in \mathbb{R}^n$ for $j \in V \setminus \{i, i_U\}$, while the components
of $\mathcal{D}^i_w(k)$ are given by: $\mathcal{D}^i_w(k)_i := -\Delta + x_i(k) \in \mathbb{R}^n$,
$\mathcal{D}^i_w(k)_{i_U} := -x_i(k) \in \mathbb{R}^n$, and $\mathcal{D}^i_w(k)_j = 0 \in \mathbb{R}^n$, for
$j \in V \setminus \{i, i_U\}$. The set $M_i$ in the projection map $P_{M_i}$
above is defined as $M_i := \{\xi_i \in \Xi_i \mid \|\xi_i\| \le \gamma + \theta\}$ for some
$\theta > 0$.
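
Steps 1 and 2 can be sketched as one synchronous update in Python. This is a minimal sketch under several assumptions that are not in the paper: scalar decisions ($n = m = 1$), a grid in place of $X$, a fixed doubly stochastic weight matrix, agents numbered along the cycle so that $i_U = i - 1$, and a hypothetical `radius` standing in for $\gamma + \theta$. For simplicity the sketch also runs Step 1 at $k = 0$. The projection onto $M_i$ is computed by clipping to the non-negative orthant and then rescaling into the ball, which gives the Euclidean projection for this particular set.

```python
import numpy as np

def project_M(xi, radius):
    """Project onto {xi >= 0, ||xi|| <= radius}: clip, then rescale."""
    xi = np.maximum(xi, 0.0)
    norm = np.linalg.norm(xi)
    return xi if norm <= radius else xi * (radius / norm)

def dads_step(x, mu, lam, w, A, alpha, f, g, X, delta, eps, radius):
    """One DADS iteration; lam[i], w[i] are agent i's length-N estimates."""
    N = len(x)
    v_lam, v_w = A @ lam, A @ w                  # consensus averaging
    x_new = x.copy()
    mu_new, lam_new, w_new = mu.copy(), lam.copy(), w.copy()
    for i in range(N):
        up = (i - 1) % N                         # in-neighbor i_U (assumed)
        # Step 1: evaluate the local Lagrangian at v_i(k) on the grid.
        L = lambda y: (f[i](y) + mu[i] * g(y)
                       + (v_lam[i, up] - v_lam[i, i]) * y
                       + (v_w[i, i] - v_w[i, up]) * y
                       - (v_lam[i, i] + v_w[i, i]) * delta)
        vals = L(X)
        if L(x[i]) > vals.min() + eps:           # old estimate not eps-optimal
            x_new[i] = X[np.argmin(vals)]
        # Step 2: supgradient D_i(k) and projected update (10).
        d_lam, d_w = np.zeros(N), np.zeros(N)
        d_lam[i], d_lam[up] = -delta - x_new[i], x_new[i]
        d_w[i], d_w[up] = -delta + x_new[i], -x_new[i]
        xi = np.concatenate(([mu[i] + alpha * g(x_new[i])],
                             v_lam[i] + alpha * d_lam,
                             v_w[i] + alpha * d_w))
        xi = project_M(xi, radius)
        mu_new[i], lam_new[i], w_new[i] = xi[0], xi[1:N + 1], xi[N + 1:]
    return x_new, mu_new, lam_new, w_new

# Hypothetical run: three agents, non-convex objectives, alpha(k) = 1/(k+1).
N, delta, eps, radius = 3, 0.1, 0.05, 50.0
X = np.linspace(-1.0, 1.0, 41)
f = [lambda y: (y - 0.5) ** 2, lambda y: np.cos(3 * y), lambda y: -y**2 + y]
g = lambda y: y - 0.8
A = 0.5 * np.eye(N) + 0.5 * np.roll(np.eye(N), 1, axis=1)
x, mu = np.zeros(N), np.zeros(N)
lam, w = np.zeros((N, N)), np.zeros((N, N))
for k in range(200):
    x, mu, lam, w = dads_step(x, mu, lam, w, A, 1.0 / (k + 1),
                              f, g, X, delta, eps, radius)
```

The sketch makes no claim about the values the iterates reach; it only shows how the consensus averaging, the threshold test of Step 1 and the projected supgradient step fit together.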

Remark 3.1: In the initialization of the DADS algorithm,
the quantity $\gamma$ is an upper bound on $D^\epsilon_\Delta$. Note that in Step 1,
the check $x_i(k-1) \in \Omega^\epsilon_i(v_i(k))$ reduces to verifying that
$\mathcal{L}_i(x_i(k-1), v_i(k)) \le Q_i(v_i(k)) + \epsilon$. Then, only if
$\mathcal{L}_i(x_i(k-1), v_i(k)) > Q_i(v_i(k)) + \epsilon$, it is necessary to find one solution
in $\Omega_i(v_i(k))$. That is, it is unnecessary to compute the whole
set $\Omega_i(v_i(k))$. In Step 2, since $M_i$ is closed and convex, the
projection map $P_{M_i}$ is well-defined. •

The primal and dual estimates in the DADS algorithm will
be shown to asymptotically converge to a pair of approximate
primal-dual solutions to the problem $(P_\Delta)$. We formally state
this in the following.

Theorem 3.1: Consider the problem (P) and the corresponding
approximate problem $(P_\Delta)$ with some $\delta > 0$. Let
the non-degeneracy assumption 2.1, the balanced communication
assumption 2.2 and the periodic strong connectivity
assumption 2.3 hold. In addition, suppose Slater's
condition 2.4 holds for the problem (P) and the strong
duality assumption 2.5 holds for the problem $(P_\Delta)$. Consider
the dual sequences $\{\mu_i(k)\}$, $\{\lambda^i(k)\}$, $\{w^i(k)\}$ and the
primal sequence $\{x_i(k)\}$ of the distributed approximate
dual subgradient algorithm with the step-sizes $\{\alpha(k)\}$ satisfying
$\lim_{k \to +\infty} \alpha(k) = 0$, $\sum_{k=0}^{+\infty} \alpha(k) = +\infty$, and $\sum_{k=0}^{+\infty} \alpha(k)^2 < +\infty$.
Then, there exists a feasible dual pair $\tilde\xi := (\tilde\mu, \tilde\lambda, \tilde w)$ such
that $\lim_{k \to +\infty} \|\mu_i(k) - \tilde\mu_i\| = 0$, $\lim_{k \to +\infty} \|\lambda^i(k) - \tilde\lambda\| = 0$, and
$\lim_{k \to +\infty} \|w^i(k) - \tilde w\| = 0$, for all $i \in V$. Moreover, there
is a feasible primal vector $\tilde x := (\tilde x_i) \in X^N$ such that
$\lim_{k \to +\infty} \|x_i(k) - \tilde x_i\| = 0$, for all $i \in V$. In addition, $(\tilde x, \tilde\xi)$ is
a pair of approximate primal-dual solutions in the sense that
$d^*_\Delta - N\epsilon \le Q(\tilde\xi) \le d^*_\Delta = p^*_\Delta \le \sum_{i \in V} f_i(\tilde x_i) \le p^*_\Delta + N\epsilon$.

$^2$We will use the superscript $i$ to indicate that $\lambda^i(k)$ and $w^i(k)$ are estimates
of some global variables.

The analysis of Theorem 3.1 will be provided in the next
section. Before doing that, we would like to discuss several
possible extensions of Theorem 3.1.

Firstly, the step-size scheme in the DADS
algorithm can be slightly generalized to the following:

$$\lim_{k \to +\infty} \alpha_i(k) = 0, \quad \sum_{k=0}^{+\infty} \alpha_i(k) = +\infty, \quad \sum_{k=0}^{+\infty} \alpha_i(k)^2 < +\infty, \quad \min_{i \in V} \alpha_i(k) \ge C_\alpha \max_{i \in V} \alpha_i(k),$$

where $\alpha_i(k)$ is the step-size of
agent $i$ at time $k$ and $C_\alpha \in (0, 1]$.
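
A simple family that meets these conditions is the scaled harmonic sequence, sketched below in Python. The choice $\alpha_i(k) = c_i/(k+1)$, the scale values and the helper name are hypothetical; the finite partial sums only illustrate the divergent and convergent behaviour and are not a proof.

```python
import numpy as np

def step_sizes(scales, horizon):
    """alpha_i(k) = scales[i] / (k + 1) for k = 0, ..., horizon - 1."""
    k = np.arange(horizon)
    return np.outer(scales, 1.0 / (k + 1))        # shape (agents, horizon)

scales = np.array([1.0, 0.5, 0.8])                # hypothetical c_i
alphas = step_sizes(scales, 10_000)

# Largest admissible C_alpha over the horizon: min_i alpha_i / max_i alpha_i.
ratio = (alphas.min(axis=0) / alphas.max(axis=0)).min()
partial_sum = alphas.sum(axis=1)                  # grows like log(horizon)
partial_sq_sum = (alphas**2).sum(axis=1)          # bounded by c_i^2 * pi^2 / 6
```

For this family the ratio of the smallest to the largest step-size is constant in $k$, so $C_\alpha$ can be taken as $\min_i c_i / \max_i c_i$.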

Secondly, the periodic strong connectivity assumption 2.3
can be weakened into the eventual strong connectivity
assumption, e.g., Assumption 6.1 in [29], if $\mathcal{G}(k)$ is undirected.

Thirdly, each agent can use a different $\epsilon_i$ in Step 1 of the
DADS algorithm, which would lead to replacing $N\epsilon$ in the
approximate solution by $\sum_{i \in V} \epsilon_i$.

Lastly, each agent $i$ could have different constraint functions
$g_i$ and constraint sets $X_i$ if a Slater vector is known to all the
agents. For example, consider the case that $g$ is convex, $X_i$ is
convex and potentially different, and there is a Slater vector
$\bar z \in \cap_{i \in V} X_i$. Then the solution $\hat z$ to the following problem is
such that $g(\hat z) \le g(\bar z) < 0$:

$$\min_{z \in \mathbb{R}^n} N g(z), \quad \text{s.t.} \quad z \in X_i, \quad \forall i \in V. \tag{11}$$

Through implementing the distributed primal subgradient
algorithm in [29], agents can solve the problem (11) in a
distributed fashion and agree upon the minimizer $\hat z$, which
coincides with a Slater vector. In such a way, Theorem 3.1
still holds and the corresponding proof is a slight variation of
those in the next section.

IV. Convergence analysis 

Recall that $g$ is continuous and $X$ is compact. Then there
are $G, H > 0$ such that $\|g(z)\| \le G$ and $\|z\| \le H$ for all
$z \in X$. We start our analysis of the DADS algorithm from the
computation of supgradients of $Q_i$.

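Lemma 4.1 (Approximate supgradient): If $\bar x_i \in \Omega^\epsilon_i(\bar\xi_i)$,
then $(g(\bar x_i)^T, (-\Delta - \bar x_i)^T, \bar x_i^T, (\bar x_i - \Delta)^T, -\bar x_i^T)^T$ is an
approximate supgradient of $Q_i$ at $\bar\xi_i$; i.e., the following holds for
any $\xi_i \in \Xi_i$:

$$Q_i(\xi_i) - Q_i(\bar\xi_i) \le \langle g(\bar x_i), \mu_i - \bar\mu_i \rangle + \langle -\Delta - \bar x_i, \lambda_i - \bar\lambda_i \rangle + \langle \bar x_i, \lambda_{i_U} - \bar\lambda_{i_U} \rangle + \langle \bar x_i - \Delta, w_i - \bar w_i \rangle + \langle -\bar x_i, w_{i_U} - \bar w_{i_U} \rangle + \epsilon. \tag{12}$$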

Proof: The proof is analogous to the computation of dual
subgradients and is omitted here due to the space
limitation. ■
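
Although the proof is omitted, the main chain of inequalities can be sketched from the definitions of $Q_i$, $\mathcal{L}_i$ and $\Omega^\epsilon_i$ alone. The following is a reader's reconstruction and not the authors' argument; it writes $\bar x_i$ for an $\epsilon$-minimiser of $\mathcal{L}_i(\cdot, \bar\xi_i)$ and $\bar\xi_i$ for the point at which the supgradient is taken (the accents are assumed notation).

```latex
\begin{align*}
Q_i(\xi_i)
  &\le \mathcal{L}_i(\bar x_i, \xi_i)
     && \text{since } Q_i(\xi_i) = \inf_{x_i \in X} \mathcal{L}_i(x_i, \xi_i) \\
  &= \mathcal{L}_i(\bar x_i, \bar\xi_i)
     + \langle g(\bar x_i), \mu_i - \bar\mu_i \rangle
     + \langle -\Delta - \bar x_i, \lambda_i - \bar\lambda_i \rangle
     + \langle \bar x_i, \lambda_{i_U} - \bar\lambda_{i_U} \rangle \\
  &\quad + \langle \bar x_i - \Delta, w_i - \bar w_i \rangle
     + \langle -\bar x_i, w_{i_U} - \bar w_{i_U} \rangle
     && \text{since } \mathcal{L}_i(\bar x_i, \cdot) \text{ is affine} \\
  &\le Q_i(\bar\xi_i) + \epsilon + \langle g(\bar x_i), \mu_i - \bar\mu_i \rangle + \cdots
     && \text{since } \bar x_i \in \Omega^\epsilon_i(\bar\xi_i).
\end{align*}
```

The middle equality only collects the coefficients of $\mu_i$, $\lambda_i$, $\lambda_{i_U}$, $w_i$ and $w_{i_U}$ in the definition of $\mathcal{L}_i$, which is where the five components of the approximate supgradient come from.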

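Since $\Omega_i(v_i(k)) \subseteq \Omega^\epsilon_i(v_i(k))$, it is clear that $x_i(k) \in
\Omega^\epsilon_i(v_i(k))$ for all $k \ge 0$. A direct result of Lemma 4.1 is
that the vector $(g(x_i(k))^T, (-\Delta - x_i(k))^T, x_i(k)^T, (x_i(k) -
\Delta)^T, -x_i(k)^T)^T$ is an approximate supgradient of $Q_i$ at $v_i(k)$;
i.e., the following approximate supgradient inequality holds
for any $\xi_i \in \Xi_i$:

$$Q_i(\xi_i) - Q_i(v_i(k)) \le \langle g(x_i(k)), \mu_i - \mu_i(k) \rangle + \langle -\Delta - x_i(k), \lambda_i - v^i_\lambda(k)_i \rangle + \langle x_i(k), \lambda_{i_U} - v^i_\lambda(k)_{i_U} \rangle + \langle x_i(k) - \Delta, w_i - v^i_w(k)_i \rangle + \langle -x_i(k), w_{i_U} - v^i_w(k)_{i_U} \rangle + \epsilon. \tag{13}$$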



Now we can see that the update rule of dual estimates in
the DADS algorithm is a combination of an approximate dual
subgradient scheme and average consensus algorithms. The
following establishes that $Q_i$ is Lipschitz continuous with
some Lipschitz constant $L$.

Lemma 4.2 (Lipschitz continuity of $Q_i$): There is a constant
$L > 0$ such that for any $\xi_i, \bar\xi_i \in \Xi_i$, it holds that

$$\|Q_i(\xi_i) - Q_i(\bar\xi_i)\| \le L \|\xi_i - \bar\xi_i\|.$$



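Proof: Similarly to Lemma 4.1, one can show that if $\bar x_i \in \Omega_i(\bar\xi_i)$,
then $(g(\bar x_i)^T, (-\Delta - \bar x_i)^T, \bar x_i^T, (\bar x_i - \Delta)^T, -\bar x_i^T)^T$ is
a supgradient of $Q_i$ at $\bar\xi_i$; i.e., the following holds for any
$\xi_i \in \Xi_i$:

$$Q_i(\xi_i) - Q_i(\bar\xi_i) \le \langle g(\bar x_i), \mu_i - \bar\mu_i \rangle + \langle -\Delta - \bar x_i, \lambda_i - \bar\lambda_i \rangle + \langle \bar x_i, \lambda_{i_U} - \bar\lambda_{i_U} \rangle + \langle \bar x_i - \Delta, w_i - \bar w_i \rangle + \langle -\bar x_i, w_{i_U} - \bar w_{i_U} \rangle.$$

Since $\|g(\bar x_i)\| \le G$ and $\|\bar x_i\| \le H$, there is $L > 0$ such that
$Q_i(\xi_i) - Q_i(\bar\xi_i) \le L \|\xi_i - \bar\xi_i\|$. Similarly, $Q_i(\bar\xi_i) - Q_i(\xi_i) \le
L \|\xi_i - \bar\xi_i\|$. The combination of these two relations renders
the desired result. ■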

In the DADS algorithm, the error induced by the projection
map $P_{M_i}$ is given by:

$$e_i(k) := P_{M_i}[v_i(k) + \alpha(k) \mathcal{D}_i(k)] - v_i(k).$$

We next provide a basic iterate relation of dual estimates in
the DADS algorithm.

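Lemma 4.3 (Basic iterate relation): Under the assumptions
in Theorem 3.1, for any $((\mu_i), \lambda, w) \in \Xi$ with
$(\mu_i, \lambda, w) \in M_i$ for all $i \in V$, the following estimate holds
for all $k \ge 0$:

$$\begin{aligned}
\sum_{i \in V} \|e_i(k) - \alpha(k) \mathcal{D}_i(k)\|^2 &\le \sum_{i \in V} \alpha(k)^2 \|\mathcal{D}_i(k)\|^2 + \sum_{i \in V} \big( \|\xi_i(k) - \xi_i\|^2 - \|\xi_i(k+1) - \xi_i\|^2 \big) \\
&\quad + 2\alpha(k) \sum_{i \in V} \big\{ \langle g(x_i(k)), \mu_i(k) - \mu_i \rangle + \langle -\Delta - x_i(k), v^i_\lambda(k)_i - \lambda_i \rangle \\
&\quad + \langle x_i(k), v^i_\lambda(k)_{i_U} - \lambda_{i_U} \rangle + \langle x_i(k) - \Delta, v^i_w(k)_i - w_i \rangle + \langle -x_i(k), v^i_w(k)_{i_U} - w_{i_U} \rangle \big\}.
\end{aligned} \tag{14}$$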

Proof: Recall that $M_i$ is closed and convex. The proof is
an application of Lemma 7.1 in the Appendix. ■

The lemma below shows that dual estimates asymptotically
converge to some approximate dual optimal solution.

Lemma 4.4 (Dual estimate convergence): Under the assumptions
in Theorem 3.1, there exists a feasible dual
pair $\tilde\xi := ((\tilde\mu_i), \tilde\lambda, \tilde w)$ such that $\lim_{k \to +\infty} \|\mu_i(k) - \tilde\mu_i\| = 0$,
$\lim_{k \to +\infty} \|\lambda^i(k) - \tilde\lambda\| = 0$, and $\lim_{k \to +\infty} \|w^i(k) - \tilde w\| = 0$.
Furthermore, the vector $\tilde\xi$ is an approximate dual solution to the
problem $(D_\Delta)$ in the sense that $d^*_\Delta - N\epsilon \le Q(\tilde\xi) \le d^*_\Delta$.

Proof: By the dual decomposition property (5) and the
boundedness of dual optimal solution sets, the dual problem
$(D_\Delta)$ is equivalent to the following:

$$\max_{(\xi_i)} \sum_{i \in V} Q_i(\xi_i), \quad \text{s.t.} \quad \xi_i \in M_i. \tag{15}$$

Note that $Q_i$ is concave (it is the pointwise infimum of functions
that are affine in $\xi_i$) and $M_i$ is convex, implying that the
problem (15) is a constrained convex program where the
global objective function is a simple sum of local ones and
the local state constraints are compact.

Since $X$ and $M_i$ are compact, there is some $J > 0$ which is
an upper bound of the norm of the last sum on the right-hand
side of (14). In this way, inequality (14) leads to:

$$\sum_{i \in V} \|\xi_i(K) - \xi_i\|^2 \le \sum_{i \in V} \|\xi_i(K') - \xi_i\|^2 + \alpha(K')^2 \sum_{i \in V} \|\mathcal{D}_i(K')\|^2 + 2\alpha(K') J, \tag{16}$$

where $K = K' + 1$. It is not difficult to see that the sequence
$\{\mathcal{D}_i(k)\}$ is uniformly bounded. Since $\lim_{k \to +\infty} \alpha(k) = 0$, then
we take the limits on $K$ and $K'$ in (16), and it renders that
$\limsup_{K \to +\infty} \sum_{i \in V} \|\xi_i(K) - \xi_i\|^2 \le \liminf_{K' \to +\infty} \sum_{i \in V} \|\xi_i(K') - \xi_i\|^2$.
Therefore, we have that $\lim_{k \to +\infty} \sum_{i \in V} \|\xi_i(k) - \xi_i\|^2$ exists.
By using this property and taking the limit on both
sides of (14), we then have $\lim_{k \to +\infty} \|e_i(k)\| = 0$. By using
Proposition 7.1 in the Appendix, we conclude that the
consensus on $\lambda$ and $w$ is asymptotically achieved; i.e.,
$\lim_{k \to +\infty} \|\lambda^i(k) - \lambda^j(k)\| = 0$ and $\lim_{k \to +\infty} \|w^i(k) - w^j(k)\| = 0$
for any $i, j \in V$. Combining these with the convergence
of $\{\sum_{i \in V} \|\xi_i(k) - \xi_i\|^2\}$ and the closedness of $M_i$, we can
deduce that there exists a feasible dual pair $\tilde\xi := ((\tilde\mu_i), \tilde\lambda, \tilde w)$
such that $\lim_{k \to +\infty} \|\mu_i(k) - \tilde\mu_i\| = 0$, $\lim_{k \to +\infty} \|\lambda^i(k) - \tilde\lambda\| = 0$,
and $\lim_{k \to +\infty} \|w^i(k) - \tilde w\| = 0$, for all $i \in V$. Furthermore, we
have $Q(\tilde\xi) \le d^*_\Delta$.

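Substitute the approximate supgradient inequality (13)
into (14), rearrange terms, and we have

$$2\alpha(k) \sum_{i \in V} \big( Q_i(\xi_i) - Q_i(v_i(k)) - \epsilon \big) \le \sum_{i \in V} \alpha(k)^2 \|\mathcal{D}_i(k)\|^2 + \sum_{i \in V} \big( \|\xi_i(k) - \xi_i\|^2 - \|\xi_i(k+1) - \xi_i\|^2 \big). \tag{17}$$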

iev 

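Let $\hat\lambda(k) := \frac{1}{N} \sum_{i \in V} \lambda^i(k)$ and $\hat w(k) := \frac{1}{N} \sum_{i \in V} w^i(k)$.
By Lipschitz continuity of $Q_i$, it follows from (17) that

$$\begin{aligned}
\sum_{i \in V} 2\alpha(k) \big( Q_i(\xi_i) - Q_i(\mu_i(k), \hat\lambda(k), \hat w(k)) - \epsilon \big) &\le \sum_{i \in V} \alpha(k)^2 \|\mathcal{D}_i(k)\|^2 + \sum_{i \in V} \big( \|\xi_i(k) - \xi_i\|^2 - \|\xi_i(k+1) - \xi_i\|^2 \big) \\
&\quad + \sum_{i \in V} 2\alpha(k) L \big( \|v^i_\lambda(k) - \hat\lambda(k)\| + \|v^i_w(k) - \hat w(k)\| \big).
\end{aligned} \tag{18}$$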

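Now we follow a contradiction argument, and suppose that
$\tilde\xi$ is not approximate dual optimal. That is, assume
that $\sum_{i \in V} Q_i(\tilde\mu_i, \tilde\lambda, \tilde w) < d^*_\Delta - N\epsilon$. Then
$\rho := -\sum_{i \in V} Q_i(\tilde\mu_i, \tilde\lambda, \tilde w) + d^*_\Delta - N\epsilon > 0$. Let $\xi_i$ in (18) be some
dual optimal solution $\xi^*_i$. Since $\lim_{k \to +\infty} \|v^i_\lambda(k) - \hat\lambda(k)\| = 0$ and
$\lim_{k \to +\infty} \|v^i_w(k) - \hat w(k)\| = 0$, there is $K' > 0$ such that for all
$k \ge K'$, there holds

$$\rho \alpha(k) \le \sum_{i \in V} \alpha(k)^2 \|\mathcal{D}_i(k)\|^2 + \sum_{i \in V} \big( \|\xi_i(k) - \xi^*_i\|^2 - \|\xi_i(k+1) - \xi^*_i\|^2 \big). \tag{19}$$

Sum (19) over $[K', K]$ and rearrange it. It gives that

$$\sum_{i \in V} \|\xi_i(K+1) - \xi^*_i\|^2 \le \sum_{k=K'}^{K} \sum_{i \in V} \alpha(k)^2 \|\mathcal{D}_i(k)\|^2 + \sum_{i \in V} \|\xi_i(K') - \xi^*_i\|^2 - \rho \sum_{k=K'}^{K} \alpha(k).$$

Since $\{\xi_i(k)\}$ converges, it is uniformly bounded. Recall that
$\{\alpha(k)\}$ is not summable but square summable. When $K$ is
sufficiently large, the right-hand side becomes negative while the
left-hand side is non-negative, which is a contradiction.
Hence, it must be that $d^*_\Delta - N\epsilon \le Q(\tilde\xi)$. ■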

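The remainder of this section is dedicated to characterizing
the convergence properties of primal estimates. Toward this
end, we present the closedness and upper semicontinuity
properties of $\Omega^\epsilon_i$.

Lemma 4.5 (Properties of $\Omega^\epsilon_i$): The approximate
set-valued marginal map $\Omega^\epsilon_i$ is closed. In addition, it is upper
semicontinuous at $\xi_i \in \Xi_i$; i.e., for any $\epsilon' > 0$, there is
$\delta' > 0$ such that for any $\tilde\xi_i \in B_{\Xi_i}(\xi_i, \delta')$, it holds that
$\Omega^\epsilon_i(\tilde\xi_i) \subset B_{2^X}(\Omega^\epsilon_i(\xi_i), \epsilon')$.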

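Proof: Consider sequences $\{x_i(k)\}$ and $\{\xi_i(k)\}$
satisfying $\lim_{k \to +\infty} \xi_i(k) = \bar\xi_i$, $x_i(k) \in \Omega^\epsilon_i(\xi_i(k))$ and
$\lim_{k \to +\infty} x_i(k) = \bar x_i$. Since $\mathcal{L}_i$ is continuous, then we have

$$\mathcal{L}_i(\bar x_i, \bar\xi_i) = \lim_{k \to +\infty} \mathcal{L}_i(x_i(k), \xi_i(k)) \le \lim_{k \to +\infty} \big( Q_i(\xi_i(k)) + \epsilon \big) = Q_i(\bar\xi_i) + \epsilon,$$

where in the inequality we use the property of $x_i(k) \in
\Omega^\epsilon_i(\xi_i(k))$, and in the last equality we use the continuity of
$Q_i$. Then $\bar x_i \in \Omega^\epsilon_i(\bar\xi_i)$ and the closedness of $\Omega^\epsilon_i$ follows.

Note that $\Omega^\epsilon_i(\xi_i) = \Omega^\epsilon_i(\xi_i) \cap X$. Recall that $\Omega^\epsilon_i$ is closed
and $X$ is compact. Then it is a result of Theorem 7.1 in the
Appendix that $\Omega^\epsilon_i$ is upper semicontinuous at $\xi_i \in \Xi_i$;
i.e., for any neighborhood $\mathcal{U}$ in $2^X$ of $\Omega^\epsilon_i(\xi_i)$, there is $\delta' > 0$
such that for all $\tilde\xi_i \in B_{\Xi_i}(\xi_i, \delta')$, it holds that $\Omega^\epsilon_i(\tilde\xi_i) \subset \mathcal{U}$. Let
$\mathcal{U} = B_{2^X}(\Omega^\epsilon_i(\xi_i), \epsilon')$, and we obtain the property of upper
semicontinuity at $\xi_i$. ■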

Upper semicontinuity of $\Omega^\epsilon_i$ ensures that each accumulation
point of $\{x_i(k)\}$ is a point in the set $\Omega^\epsilon_i(\tilde\xi_i)$; i.e., the
convergence of $\{x_i(k)\}$ to the set $\Omega^\epsilon_i(\tilde\xi_i)$ can be guaranteed. In what
follows, we further characterize the convergence of $\{x_i(k)\}$
to a point in $\Omega^\epsilon_i(\tilde\xi_i)$ within a finite time.

Lemma 4.6 (Primal estimate convergence): For each $i \in
V$, there are a finite time $T_i \ge 0$ and $\tilde x_i \in \Omega^\epsilon_i(\tilde\xi_i)$ such that
$x_i(k) = \tilde x_i$ for all $k \ge T_i + 1$.

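Proof: Choose $\hat\epsilon > 0$ and $\bar\epsilon > 0$ such that
$2(G + 4H + 2\sqrt{m}\delta)\hat\epsilon + 2\bar\epsilon < \epsilon$. Since $Q_i$ is continuous and
$\lim_{k \to +\infty} \|v_i(k) - \tilde\xi_i\| = 0$, then there is $K_i \ge 0$ such that for
all $k \ge K_i$, it holds that

$$\|\tilde\xi_i - v_i(k)\| \le \hat\epsilon, \quad \|Q_i(\tilde\xi_i) - Q_i(v_i(k))\| \le \bar\epsilon. \tag{20}$$

The time instant $T_i \ge 0$ is defined as follows: if there is
some finite time $k \ge K_i + 1$ such that $\mathcal{L}_i(x_i(k), v_i(k+1)) >
Q_i(v_i(k+1)) + \epsilon$, then $T_i$ is the smallest one among such $k$;
otherwise, $T_i = K_i + 1$. In what follows we prove that $T_i$ is
the time in the statement of the lemma.

Consider the first case of $T_i$. In this case, $\mathcal{L}_i(x_i(T_i), v_i(T_i + 1)) >
Q_i(v_i(T_i + 1)) + \epsilon$; i.e., $x_i(T_i) \notin \Omega^\epsilon_i(v_i(T_i + 1))$. Then
$x_i(T_i + 1) \in \Omega_i(v_i(T_i + 1))$; i.e., $\mathcal{L}_i(x_i(T_i + 1), v_i(T_i + 1)) =
Q_i(v_i(T_i + 1))$. By using this property, we have that for any
$k \ge T_i + 1$, it holds that

$$\begin{aligned}
\|\mathcal{L}_i(x_i(T_i + 1), v_i(k)) - Q_i(\tilde\xi_i)\| &\le \|\mathcal{L}_i(x_i(T_i + 1), v_i(k)) - Q_i(v_i(T_i + 1))\| + \|Q_i(v_i(T_i + 1)) - Q_i(\tilde\xi_i)\| \\
&= \|\mathcal{L}_i(x_i(T_i + 1), v_i(k)) - \mathcal{L}_i(x_i(T_i + 1), v_i(T_i + 1))\| + \|Q_i(v_i(T_i + 1)) - Q_i(\tilde\xi_i)\|.
\end{aligned} \tag{21}$$

Notice that the term $\|\mathcal{L}_i(x_i(T_i + 1), v_i(k)) - \mathcal{L}_i(x_i(T_i +
1), v_i(T_i + 1))\|$ can be upper bounded in the following way:

$$\begin{aligned}
&\|\mathcal{L}_i(x_i(T_i + 1), v_i(k)) - \mathcal{L}_i(x_i(T_i + 1), v_i(T_i + 1))\| \\
&= \big\| \langle \mu_i(k) - \mu_i(T_i + 1), g(x_i(T_i + 1)) \rangle \\
&\quad + \langle -v^i_\lambda(k)_i + v^i_\lambda(k)_{i_U} + v^i_\lambda(T_i + 1)_i - v^i_\lambda(T_i + 1)_{i_U}, x_i(T_i + 1) \rangle \\
&\quad + \langle v^i_w(k)_i - v^i_w(k)_{i_U} - v^i_w(T_i + 1)_i + v^i_w(T_i + 1)_{i_U}, x_i(T_i + 1) \rangle \\
&\quad - \langle v^i_\lambda(k)_i - v^i_\lambda(T_i + 1)_i, \Delta \rangle - \langle v^i_w(k)_i - v^i_w(T_i + 1)_i, \Delta \rangle \big\| \\
&\le 2(G + 4H + 2\sqrt{m}\delta)\hat\epsilon.
\end{aligned} \tag{22}$$

Substituting (20) and (22) into (21) gives that

$$\|\mathcal{L}_i(x_i(T_i + 1), v_i(k)) - Q_i(\tilde\xi_i)\| \le 2(G + 4H + 2\sqrt{m}\delta)\hat\epsilon + \bar\epsilon. \tag{23}$$

This implies that for any $k \ge T_i + 1$, it holds that

$$\begin{aligned}
0 &\le \mathcal{L}_i(x_i(T_i + 1), v_i(k)) - Q_i(v_i(k)) \\
&\le \|\mathcal{L}_i(x_i(T_i + 1), v_i(k)) - Q_i(\tilde\xi_i)\| + \|Q_i(\tilde\xi_i) - Q_i(v_i(k))\| \\
&\le 2(G + 4H + 2\sqrt{m}\delta)\hat\epsilon + 2\bar\epsilon < \epsilon.
\end{aligned}$$

Hence, we conclude that $x_i(T_i + 1) \in \Omega^\epsilon_i(v_i(k))$ for all $k \ge
T_i + 1$, and thus $x_i(k) = x_i(T_i + 1)$ for all $k \ge T_i + 1$.

We now consider the second possibility for $T_i$. In this case,
$\mathcal{L}_i(x_i(k), v_i(k+1)) \le Q_i(v_i(k+1)) + \epsilon$ for all $k \ge T_i =
K_i + 1$. Therefore, we have $x_i(T_i + 1) \in \Omega^\epsilon_i(v_i(k))$ and then
$x_i(k) = x_i(T_i + 1)$ for all $k \ge T_i + 1$.

In both cases, the chosen finite $T_i \ge 0$ guarantees that for
all $k \ge T_i + 1$, $x_i(k) = x_i(T_i + 1)$ and $x_i(k) \in \Omega^\epsilon_i(v_i(T_i + 1))$.
Upper semicontinuity of $\Omega^\epsilon_i$ ensures $x_i(T_i + 1) \in \Omega^\epsilon_i(\tilde\xi_i)$. ■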

Now we are ready to show the main result of this paper,
Theorem 3.1. In particular, we will show the property of
complementary slackness, primal feasibility of $\tilde x$, and characterize
its primal suboptimality.

Proof of Theorem 3.1:

Claim 1: $\langle -\Delta - \tilde x_i + \tilde x_{i_D}, \tilde\lambda_i \rangle = 0$, $\langle -\Delta + \tilde x_i - \tilde x_{i_D}, \tilde w_i \rangle = 0$
and $\langle g(\tilde x_i), \tilde\mu_i \rangle = 0$.

Proof: Rearranging the terms related to A in ( TBl i leads 
to the following inequality holding for any ((//,), A, w) £ 3 
with (fii,X, w) £ Mi for all i £ V: 



Sum dH over [0,7C], divide by s(K) := ]T" =0 1111(1 
we have 



- 2a(*)«-A - x^fe), «!(*)* - A,-) 
iev 

+ (»i D (fe),t;jf (fc)< - A,)) < a(k) 2 Y, \\Vi(k)\\ 2 

iev 

+£(ii&w-&iiMi^+i)-&ii 2 ) 

+ 2a(fc) 5^{(-aJi(jfe), <,(fe)w - to it ,) + (xi(fc) - A, 



K 



-i- £ a(k) £ 2 « A + Xi(k),v\(k)i - A,) 
s ^ ' k=o iev 

+ (-x in {k),vi°(k)i -Xi))< -i- f>(fc) 2 J2 llAWf 

+ ^{£(H&(°) - ^ ||2 - "^ (A ' + 1} - ^ ||2) 

^ ' iev 

A' 

- ^) + (x,(fc) - A, 

k=o iev 

vi(k)i - vii) + (-Xi(k), vHk)^ - w iv ))}. (25) 



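We now proceed to show $\langle -\Delta - \tilde{x}_i + \tilde{x}_{i_D}, \tilde{\lambda}_i \rangle \ge 0$ for each $i \in V$. Notice that we have shown that $\lim_{k \to +\infty} \|x_i(k) - \tilde{x}_i\| = 0$ for all $i \in V$, and it also holds that $\lim_{k \to +\infty} \|\xi_i(k) - \tilde{\xi}_i\| = 0$ for all $i \in V$. Let $\lambda_i = \frac{1}{2}\tilde{\lambda}_i$, $\lambda_j = \tilde{\lambda}_j$ for $j \ne i$ and $\mu_i = \tilde{\mu}_i$, $w = \tilde{w}$ in (25). Recall that $\{\alpha(k)\}$ is not summable but square summable, and $\{\mathcal{D}_i(k)\}$ is uniformly bounded. Take $K \to +\infty$, and then it follows from Lemma 7.2 in the Appendix that:

$$\langle \Delta + \tilde{x}_i - \tilde{x}_{i_D}, \tilde{\lambda}_i \rangle \le 0. \tag{26}$$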



iev 



v l w (k)i - Wi) + (g(xi(k)),Hi(k) - fit)}. 



(24) 



On the other hand, since £ £ D e A , we have ||£|j < 7 by ([8]). 
Then we could choose a sufficiently small d' > and £ 6 3 
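in (25) such that $\|\xi\| \le \gamma + \theta$, where $\theta$ is given in the definition of $M_i$ and $\xi$ is given by: $\lambda_i = (1 + \delta')\tilde{\lambda}_i$, $\lambda_j = \tilde{\lambda}_j$ for $j \ne i$, $w = \tilde{w}$, $\mu = \tilde{\mu}$. Following the same lines toward (26), it gives that $-\delta' \langle \Delta + \tilde{x}_i - \tilde{x}_{i_D}, \tilde{\lambda}_i \rangle \le 0$. Hence, it holds that $\langle -\Delta - \tilde{x}_i + \tilde{x}_{i_D}, \tilde{\lambda}_i \rangle = 0$. The rest of the proof is analogous and thus omitted. ■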

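Claim 2: $\tilde{x}$ is primal feasible to the problem $(P_\Delta)$.

Proof: We already know that $\tilde{x}_i \in X$. We proceed to show $-\Delta - \tilde{x}_i + \tilde{x}_{i_D} \le 0$ by contradiction. Since $\|\tilde{\xi}\| \le \gamma$, we could choose a sufficiently small $\delta' > 0$ and $\xi$ with $\|\xi\| \le \gamma + \theta$ in (25) as follows: if $(-\Delta - \tilde{x}_i + \tilde{x}_{i_D})_\ell > 0$, then $(\lambda_i)_\ell = (\tilde{\lambda}_i)_\ell + \delta'$; otherwise, $(\lambda_i)_\ell = (\tilde{\lambda}_i)_\ell$, and $w = \tilde{w}$, $\mu = \tilde{\mu}$. The rest of the proof is analogous to Claim 1.

Similarly, one can show $g(\tilde{x}_i) \le 0$ and $-\Delta + \tilde{x}_i - \tilde{x}_{i_D} \le 0$ by applying analogous arguments. ■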

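Claim 3: It holds that $p^*_\Delta \le \sum_{i \in V} f_i(\tilde{x}_i) \le p^*_\Delta + N\epsilon$.

Proof: Since $\tilde{x}$ is primal feasible, then $\sum_{i \in V} f_i(\tilde{x}_i) \ge p^*_\Delta$. On the other hand, $\sum_{i \in V} f_i(\tilde{x}_i) = \sum_{i \in V} \mathcal{L}_i(\tilde{x}_i, \tilde{\xi}) \le$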



V. An illustrative example 

In this section, we examine a numerical example to illustrate 
the performance of our algorithm. Consider a network of four 
agents and let the objective functions of agents $f_i : \mathbb{R}_{\ge 0} \to \mathbb{R}$



x 



be equal and defined as follows 

AGO - 




ze [0,1], 

zG [1,2], 
z G [2, +oo). 
ze [0,2], 
ze [2,3], 
z e [3, +oo). 

, / 4 (x) = (z - 



0.5) 2 



It is easy to verify that $f_1$ and $f_2$ are not convex and $f_3$ and $f_4$ are convex. The primal problem of interest is given by:

$$\min \sum_{i=1}^{4} f_i(z), \quad \text{s.t.} \quad z \in X := [0, 10]. \tag{27}$$



The objective function of problem (27) is piecewise convex, and it is not difficult to check that it has a unique solution $z = 1$ and the optimal value is $p^* = \frac{41}{32} \approx 1.2813$. The associated approximate problem to (27) is then:

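$$\begin{aligned}
\min \; & \sum_{i=1}^{4} f_i(x_i), \\
\text{s.t.} \; & x_i \in X := [0, 10], \quad \forall i \in V, \\
& x_1 - x_2 \le \delta, \quad x_2 - x_1 \le \delta, \\
& x_2 - x_3 \le \delta, \quad x_3 - x_2 \le \delta, \\
& x_3 - x_4 \le \delta, \quad x_4 - x_3 \le \delta, \\
& x_4 - x_1 \le \delta, \quad x_1 - x_4 \le \delta,
\end{aligned} \tag{28}$$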

where the scalar $\delta = 1$. It can be seen that for any value $z \in X$, $x = [z \;\, z \;\, z \;\, z]^T$ is a Slater vector of problem (28) and all the agents can agree upon the value $z$ through the max-consensus algorithm within a finite number of iterations. Here, we choose $z = 0.5$. We further choose the tolerance level $\epsilon = 0.1$, and
then compute 71 = 72 = f and 73 = °' 7 ^ +e , 74 = 4. 
Therefore, we have 7 = 4maxi £ y 7$ = 4 0,7 ^ +e = 2.65 and 
then choose 9 = 0.35 for the set Mi. 

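We now proceed to check the strong duality of problem (28). To do this, we first define the Lagrangian function $\mathcal{L}$ as follows:

$$\begin{aligned}
\mathcal{L}(x, \xi) = \sum_{i \in V} f_i(x_i)
&+ \langle \lambda_1, -x_1 + x_2 - \delta \rangle + \langle w_1, x_1 - x_2 - \delta \rangle \\
&+ \langle \lambda_2, -x_2 + x_3 - \delta \rangle + \langle w_2, x_2 - x_3 - \delta \rangle \\
&+ \langle \lambda_3, -x_3 + x_4 - \delta \rangle + \langle w_3, x_3 - x_4 - \delta \rangle \\
&+ \langle \lambda_4, -x_4 + x_1 - \delta \rangle + \langle w_4, x_4 - x_1 - \delta \rangle,
\end{aligned}$$

where $\xi := (\lambda_i, w_i)_{i \in V}$. The dual function is given by $Q(\xi) = \inf_{x \in X^4} \mathcal{L}(x, \xi)$. Notice that $Q(0) = \inf_{x \in X^4} \sum_{i \in V} f_i(x_i) = \frac{17}{16}$. The primal and dual optimal values of problem (28) are denoted by $p^*_\delta$ and $d^*_\delta$, respectively. Note that $p^*_\delta = \frac{17}{16} \approx 1.0625$ with $[1 \;\, 1 \;\, 0 \;\, 0.5]^T$ being one of the primal solutions and thus $Q(0) = p^*_\delta$. This establishes that $d^*_\delta \ge Q(0) = p^*_\delta$. On the other hand, it follows from weak duality that $p^*_\delta \ge d^*_\delta$. We now conclude that $p^*_\delta = d^*_\delta$ and thus the duality gap of problem (28) is zero.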
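The two feasibility facts used in this example (that $[z \;\, z \;\, z \;\, z]^T$ with $z = 0.5$ is a Slater vector, and that $[1 \;\, 1 \;\, 0 \;\, 0.5]^T$ is feasible for problem (28)) can be checked mechanically. The following is a minimal sketch; the helper names `in_state_set` and `max_violation` are hypothetical, and the eight coupling constraints are encoded as $|x_i - x_j| \le \delta$ over the ring of agent pairs.

```python
DELTA = 1.0                                   # the scalar delta in problem (28)
RING = [(0, 1), (1, 2), (2, 3), (3, 0)]       # pairs (1,2), (2,3), (3,4), (4,1), zero-based


def in_state_set(x, lo=0.0, hi=10.0):
    """True when every component lies in the state constraint set [lo, hi]."""
    return all(lo <= xi <= hi for xi in x)


def max_violation(x, delta=DELTA):
    """Largest value of |x_i - x_j| - delta over the ring.

    A negative value means all coupling constraints hold strictly;
    zero means at least one constraint is active.
    """
    return max(abs(x[i] - x[j]) - delta for i, j in RING)


slater = [0.5, 0.5, 0.5, 0.5]
candidate = [1.0, 1.0, 0.0, 0.5]

slater_gap = max_violation(slater)        # strictly negative: Slater vector
candidate_gap = max_violation(candidate)  # zero: feasible, with x_2 - x_3 <= delta active
```

The check also shows that the constraint $x_2 - x_3 \le \delta$ is active at $[1 \;\, 1 \;\, 0 \;\, 0.5]^T$, so this point is feasible but not strictly feasible.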



Figures 1 and 2 show the evolution of the states $x_i(k)$ and of the global objective function of problem (28). After 150 iterations, the states $x_i(k)$ converge to 0.2436, 0, and 0.1509, respectively, which constitute a feasible solution. Figure 2 indicates that $\sum_{i=1}^{4} f_i(x_i(k))$ converges to the value 1.1844, which lies in the interval $[p^*_\delta - 4\epsilon, \, p^*_\delta + 4\epsilon] = [0.6625, 1.4625]$ and is a good approximation of $p^*$ and $p^*_\delta$.
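The interval arithmetic above can be reproduced in a few lines. This is only a check of the numbers quoted in the text ($N = 4$, $\epsilon = 0.1$, $p^*_\delta \approx 1.0625$, $p^* \approx 1.2813$ and the reported limit 1.1844); the variable names are illustrative.

```python
N = 4               # number of agents
EPS = 0.1           # tolerance level epsilon
P_DELTA = 1.0625    # optimal value of the approximate problem (28)
P_STAR = 1.2813     # optimal value of the original problem (27), as quoted
LIMIT = 1.1844      # reported limit of the global objective

lower = P_DELTA - N * EPS
upper = P_DELTA + N * EPS
inside = lower <= LIMIT <= upper

# Distances of the reported limit from the two optimal values.
gap_to_p_delta = abs(LIMIT - P_DELTA)
gap_to_p_star = abs(LIMIT - P_STAR)
```

With these numbers the reported limit sits about 0.12 above $p^*_\delta$ and about 0.10 below $p^*$, well inside the guaranteed band of width $2N\epsilon = 0.8$.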

VI. Conclusion 

We have studied a multi-agent optimization problem where 
the goal of agents is to minimize a sum of local objective 
functions in the presence of a global inequality constraint and 
a global state constraint set. Objective and constraint functions 
as well as constraint sets are not necessarily convex. We have presented a distributed approximate dual subgradient algorithm which allows agents to asymptotically converge to a pair of approximate primal-dual solutions provided that Slater's condition and the strong duality property are satisfied.

VII. Appendix 

A. Nonexpansion property of projection operators 

Lemma 7.1: [3] Let $Z$ be a non-empty, closed and convex set in $\mathbb{R}^n$. For any $z \in \mathbb{R}^n$, the following holds for any $y \in Z$:

$$\|P_Z[z] - y\|^2 \le \|z - y\|^2 - \|P_Z[z] - z\|^2.$$
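As a quick numerical sanity check of Lemma 7.1, the sketch below takes $Z$ to be the box $[0, 10]^3$, for which the projection is a componentwise clamp. The choice of set, the dimension, the sampling ranges and the helper names are all assumptions made for illustration; a random test of this kind supports the inequality but does not prove it.

```python
import random


def project_box(z, lo=0.0, hi=10.0):
    """Euclidean projection onto the box [lo, hi]^n (componentwise clamp)."""
    return [min(max(zi, lo), hi) for zi in z]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def lemma_gap(z, y):
    """Right-hand side minus left-hand side of Lemma 7.1; nonnegative when it holds."""
    p = project_box(z)
    return sq_dist(z, y) - sq_dist(p, z) - sq_dist(p, y)


rng = random.Random(0)
gaps = []
for _ in range(1000):
    z = [rng.uniform(-20.0, 30.0) for _ in range(3)]   # arbitrary point
    y = [rng.uniform(0.0, 10.0) for _ in range(3)]     # point inside Z
    gaps.append(lemma_gap(z, y))

min_gap = min(gaps)
```

The gap equals $2\langle z - P_Z[z], P_Z[z] - y \rangle$, so it is zero exactly when $z$ already lies in $Z$ or the projection residual is orthogonal to $P_Z[z] - y$.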

B. A property of weighted sequence 

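Lemma 7.2: [29] Consider the sequence $\{\delta(k)\}$ defined by

$$\delta(k) := \frac{\sum_{\tau=0}^{k} \alpha(\tau)\rho(\tau)}{\sum_{\tau=0}^{k} \alpha(\tau)},$$

where $\rho(k) \in \mathbb{R}^n$, $\alpha(k) \ge 0$, and $\sum_{k=0}^{+\infty} \alpha(k) = +\infty$. If $\lim_{k \to +\infty} \rho(k) = \rho^*$, then $\lim_{k \to +\infty} \delta(k) = \rho^*$.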
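Lemma 7.2 can be illustrated numerically with a scalar example. The sequences below are assumptions chosen for illustration: weights $\alpha(k) = 1/(k+1)$, which are nonnegative, not summable but square summable, and $\rho(k) = 2 + 1/(k+1)$, which converges to $\rho^* = 2$.

```python
def weighted_average(alpha, rho, k):
    """delta(k): the alpha-weighted average of rho(0), ..., rho(k)."""
    num = sum(alpha(t) * rho(t) for t in range(k + 1))
    den = sum(alpha(t) for t in range(k + 1))
    return num / den


def alpha(t):
    """Illustrative step size: nonnegative and not summable."""
    return 1.0 / (t + 1)


def rho(t):
    """Illustrative sequence converging to rho* = 2."""
    return 2.0 + 1.0 / (t + 1)


RHO_STAR = 2.0
horizons = (10, 100, 1000, 10000)
errors = [abs(weighted_average(alpha, rho, k) - RHO_STAR) for k in horizons]
```

For this particular choice the error decays only like $1/\log k$, because the weights put a lot of mass on early terms; the lemma asserts convergence, not a rate.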



C. Background on set-valued maps 

We let $X$ and $Y$ denote Hausdorff topological spaces. A set-valued map $\Omega : X \to Y$ is a map that associates with any $x \in X$ a subset $\Omega(x)$ of $Y$. The following definitions and theorem are adopted from [1].

Definition 7.1: The set-valued map $\Omega$ is closed at a point $x \in X$ if $\{x(k)\} \subset X$, $\lim_{k \to +\infty} \mathrm{dist}(x(k), x) = 0$, $y(k) \in \Omega(x(k))$, and $\lim_{k \to +\infty} \mathrm{dist}(y(k), y) = 0$ implies that $y \in \Omega(x)$.

Definition 7.2: The set-valued map $\Omega$ is called upper semicontinuous at $x \in X$ if and only if for any neighborhood $\mathcal{U}$ of $\Omega(x)$, there is $\eta > 0$ such that $\forall x' \in B(x, \eta)$, it holds that $\Omega(x') \subset \mathcal{U}$.

Theorem 7.1: Let $\Omega$ and $\Pi$ be two set-valued maps from $X$ to $Y$. Assume that $\Omega$ is closed, $\Pi(x)$ is compact and $\Pi$ is upper semicontinuous at $x \in X$. Then $\Omega \cap \Pi$ is upper semicontinuous at $x$.

D. Dynamic average consensus algorithms 

The following is the vector version of the first-order dynamic average consensus algorithm proposed in [30]:

$$x^i(k+1) = \sum_{j=1}^{N} a^i_j(k)\, x^j(k) + \eta^i(k), \tag{29}$$

where $x^i(k), \eta^i(k) \in \mathbb{R}^n$. Denote $\Delta\eta_\ell(k) := \max_{i \in V} \eta^i_\ell(k) - \min_{i \in V} \eta^i_\ell(k)$ for $1 \le \ell \le n$.

Proposition 7.1: [30] Let the periodic strong connectivity assumption 2.3, the non-degeneracy assumption 2.1 and the balanced communication assumption 2.2 hold. Assume that $\lim_{k \to +\infty} \Delta\eta_\ell(k) = 0$ for all $1 \le \ell \le n$ and all $k \ge 0$. Then the implementation of Algorithm (29) achieves consensus, i.e., $\lim_{k \to +\infty} \|x^i(k) - x^j(k)\| = 0$ for all $i, j \in V$.
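The consensus claim of Proposition 7.1 can be illustrated on a small scalar instance of (29). The setup below is an assumption made for illustration: a fixed four-agent ring with a hypothetical symmetric doubly stochastic weight matrix `W` (so the connectivity, non-degeneracy and balance assumptions hold trivially), and inputs $\eta^i(k) = i/(k+1)^2$ whose spread across agents vanishes as $k$ grows.

```python
def consensus_step(x, weights, eta):
    """One step of x_i(k+1) = sum_j a_ij * x_j(k) + eta_i(k) for scalar states."""
    n = len(x)
    return [sum(weights[i][j] * x[j] for j in range(n)) + eta[i] for i in range(n)]


# Hypothetical doubly stochastic weights on a fixed 4-agent ring.
W = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]

x = [0.0, 3.0, 7.0, 10.0]    # arbitrary initial states
spreads = []                 # max_i x_i - min_i x_i after each step
for k in range(200):
    # Inputs differ across agents, but their spread 3 / (k + 1)^2 vanishes.
    eta = [i / (k + 1) ** 2 for i in range(4)]
    x = consensus_step(x, W, eta)
    spreads.append(max(x) - min(x))
```

The states keep drifting because the inputs have a nonzero average, but the disagreement `spreads[k]` shrinks toward zero, which is the consensus property stated in the proposition.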









Fig. 1. The evolution of states with the convergent vector of [0.2436 0.1509]^T. Legend: the trajectories of x_1, x_2, x_3 and x_4.







Fig. 2. The evolution of the global objective function along the trajectories of the system with the convergent value of 1.184.